Single MCMC chain parallelisation on decision trees

Abstract

Decision trees (DT) are highly popular in machine learning and often achieve state-of-the-art performance. Despite that, well-known variants such as CART, ID3, random forest, and boosted trees lack a probabilistic version that encodes prior assumptions about tree structures and shares statistical strength between node parameters. Existing work on Bayesian DT depends on Markov Chain Monte Carlo (MCMC), which can be computationally slow, especially on high-dimensional data and with expensive proposals. In this study, we propose a method to parallelise a single MCMC chain on an average laptop or personal computer, which enables us to reduce its run-time through multi-core processing while keeping the results statistically identical to a conventional sequential implementation. We also calculate the theoretical and practical reduction in run time that can be obtained by utilising our method on multi-processor architectures. Experiments showed that we could achieve an 18 times faster running time, provided that the serial and parallel implementations are statistically identical.
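The abstract does not describe the parallelisation scheme itself, so the following is only an illustrative sketch of one generic way to parallelise a single chain: speculative ("prefetching") Metropolis-Hastings. It is not the authors' method. The toy target `log_target`, the helper names, and the closed-form `expected_steps_per_batch` (which assumes a constant, independent rejection probability) are all assumptions introduced for illustration. The idea is to evaluate several proposals from the current state in parallel, then apply the usual accept/reject rule in order, so the output matches a serial run that uses the same random numbers.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def log_target(x):
    """Toy log-density (standard normal up to a constant); a stand-in for an expensive posterior."""
    return -0.5 * x * x


def draw_randomness(n_steps, step_size, seed):
    """Pre-draw proposal noise and log-uniforms so serial and parallel runs share them."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, step_size) for _ in range(n_steps)]
    log_u = [math.log(1.0 - rng.random()) for _ in range(n_steps)]  # argument lies in (0, 1]
    return noise, log_u


def serial_chain(x0, noise, log_u):
    """Plain random-walk Metropolis-Hastings, one proposal per step."""
    x, lp, chain = x0, log_target(x0), []
    for eps, lu in zip(noise, log_u):
        prop = x + eps
        prop_lp = log_target(prop)
        if lu < prop_lp - lp:  # symmetric proposal, so only the target ratio matters
            x, lp = prop, prop_lp
        chain.append(x)
    return chain


def prefetch_chain(x0, noise, log_u, n_workers=4):
    """Speculative variant: score n_workers proposals from the current state at once."""
    x, lp, chain = x0, log_target(x0), []
    n, t = len(noise), 0
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        while t < n:
            idx = range(t, min(t + n_workers, n))
            props = [x + noise[i] for i in idx]  # all assume the earlier ones were rejected
            prop_lps = list(pool.map(log_target, props))
            for i, prop, prop_lp in zip(idx, props, prop_lps):
                t = i + 1
                if log_u[i] < prop_lp - lp:
                    x, lp = prop, prop_lp
                    chain.append(x)
                    break  # later speculative evaluations are now stale; discard them
                chain.append(x)
    return chain


def expected_steps_per_batch(reject_prob, n_workers):
    """Expected chain steps advanced per parallel batch: 1 + r + ... + r^(k-1)."""
    return sum(reject_prob ** j for j in range(n_workers))


if __name__ == "__main__":
    noise, log_u = draw_randomness(1000, 1.0, seed=0)
    same = serial_chain(0.0, noise, log_u) == prefetch_chain(0.0, noise, log_u)
    print("identical to serial run:", same)
    print("expected steps per batch (r=0.8, k=4):", expected_steps_per_batch(0.8, 4))
```

Threads are used here only to keep the sketch short; for CPU-bound likelihoods in CPython, a process pool would be needed to use multiple cores. The speed-up of this scheme depends on the rejection rate, since every accepted proposal invalidates the remaining speculative evaluations in its batch.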

Similar resources

On Quantum Decision Trees

Quantum decision systems are being increasingly considered for use in artificial intelligence applications. Classical and quantum nodes can be distinguished based on certain correlations in their states. This paper investigates some properties of the states obtained in a decision tree structure. How these correlations may be mapped to the decision tree is considered. Classical tree representati...

Phylogenetic MCMC algorithms are misleading on mixtures of trees.

Markov chain Monte Carlo (MCMC) algorithms play a critical role in the Bayesian approach to phylogenetic inference. We present a theoretical analysis of the rate of convergence of many of the widely used Markov chains. For N characters generated from a uniform mixture of two trees, we prove that the Markov chains take an exponentially long (in N) number of iterations to converge to the posterio...

Parallelizing MCMC with Random Partition Trees

The modern scale of data has brought new challenges to Bayesian inference. In particular, conventional MCMC algorithms are computationally very expensive for large data sets. A promising approach to solve this problem is embarrassingly parallel MCMC (EP-MCMC), which first partitions the data into multiple subsets and runs independent sampling algorithms on each subset. The subset posterior draw...

Decision Trees on Parallel Processors

A framework for induction of decision trees suitable for implementation on shared- and distributed-memory multiprocessors or networks of workstations is described. The approach, called Parallel Decision Trees (PDT), overcomes limitations of equivalent serial algorithms that have been reported by several researchers, and enables the use of the very-large-scale training sets that are increasingly...

Constructive Induction On Decision Trees

Selective induction techniques perform poorly when the features are inappropriate for the target concept. One solution is to have the learning system construct new features automatically; unfortunately, feature construction is a difficult and poorly understood problem. In this paper we present a definition of feature construction in concept learning, and offer a framework for its study based on...

Journal

Journal title: Annals of Mathematics and Artificial Intelligence

Year: 2023

ISSN: 1573-7470, 1012-2443

DOI: https://doi.org/10.1007/s10472-023-09876-9